
Collaborating Authors

grammar learning


ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding

Cagatan, Omer Veysel

arXiv.org Artificial Intelligence

We present ToddlerBERTa, a BabyBERTa-like language model, exploring its capabilities through five different models with varied hyperparameters. Evaluating on BLiMP, SuperGLUE, MSGS, and a Supplement benchmark from the BabyLM challenge, we find that smaller models can excel in specific tasks, while larger models perform well with substantial data. Despite training on a smaller dataset, ToddlerBERTa demonstrates commendable performance, rivalling the state-of-the-art RoBERTa-base. The model showcases robust language understanding, even with single-sentence pretraining, and competes with baselines that leverage broader contextual information. Our work provides insights into hyperparameter choices and data utilization, contributing to the advancement of language models.
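
BLiMP evaluation works on minimal pairs: the model is credited when it assigns higher probability to the grammatical sentence of each pair. For a masked LM like BabyBERTa, this is commonly done with pseudo-log-likelihood scoring (mask each token in turn and sum its log-probability). The sketch below is illustrative, not the paper's evaluation code; "roberta-base" stands in for a ToddlerBERTa checkpoint, whose name is not given here, and the example pair is invented.

```python
# Minimal sketch: scoring a BLiMP-style minimal pair with a masked LM.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # stand-in checkpoint
model = AutoModelForMaskedLM.from_pretrained("roberta-base").eval()

def pseudo_log_likelihood(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Skip BOS/EOS; mask one position at a time and score the hidden token.
    for pos in range(1, len(ids) - 1):
        masked = ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, pos]
        total += torch.log_softmax(logits, dim=-1)[ids[pos]].item()
    return total

good = "The cats on the mat are sleeping."  # invented example pair
bad = "The cats on the mat is sleeping."
# The pair counts as correct if the grammatical sentence scores higher.
print(pseudo_log_likelihood(good) > pseudo_log_likelihood(bad))
```

Averaging such per-pair decisions over each BLiMP paradigm yields the accuracy figures that benchmarks like the BabyLM challenge report.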


Grammar Learning by a Self-Organizing Network

Negishi, Michiro

Neural Information Processing Systems

Michiro Negishi, Dept. of Cognitive and Neural Systems, Boston University, 111 Cummington Street, Boston, MA 02215. Email: negishi@cns.bu.edu. This paper presents the design and simulation results of a self-organizing neural network which induces a grammar from example sentences. Input sentences are generated from a simple phrase structure grammar including number agreement, verb transitivity, and recursive noun phrase construction rules. The network induces a grammar explicitly in the form of symbol categorization rules and phrase structure rules. The purpose of this research is to show that a self-organizing network with a certain structure can acquire syntactic knowledge from only positive (i.e. grammatical) data. There has been research on supervised neural network models of language acquisition tasks [Elman, 1991; Miikkulainen and Dyer, 1988; St. John and McClelland, 1988]. Unlike these supervised models, the current model self-organizes word and phrasal categories and phrase construction rules through mere exposure to input sentences, without any artificially defined task goals.
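
To make the training setup concrete, a toy sentence generator in the spirit of that grammar is sketched below. The rules are illustrative stand-ins, not Negishi's actual grammar: they cover subject-verb number agreement, an intransitive/transitive verb split, and recursion through "of"-attached noun phrases.

```python
import random

# Illustrative toy grammar (not the paper's exact rule set).
NOUNS = {"sg": ["dog", "girl"], "pl": ["dogs", "girls"]}
INTRANS = {"sg": ["runs"], "pl": ["run"]}
TRANS = {"sg": ["sees"], "pl": ["see"]}

def np(number: str, depth: int = 0) -> str:
    """Noun phrase of a given number; may recurse via NP -> NP 'of' NP."""
    noun = random.choice(NOUNS[number])
    # Recursion capped at depth 2 so generated sentences stay short.
    if depth < 2 and random.random() < 0.3:
        return f"the {noun} of {np(random.choice(['sg', 'pl']), depth + 1)}"
    return f"the {noun}"

def sentence() -> str:
    number = random.choice(["sg", "pl"])
    subj = np(number)
    # The verb agrees with the head noun even across an embedded "of"-NP.
    if random.random() < 0.5:
        return f"{subj} {random.choice(INTRANS[number])}"
    return f"{subj} {random.choice(TRANS[number])} {np(random.choice(['sg', 'pl']))}"

for _ in range(3):
    print(sentence())
```

Every output is grammatical by construction, which is the point of the positive-only setting: the network never sees labeled counterexamples, only well-formed sentences.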

